@marlon-costa-dc
Contributor

Summary

This PR implements complete metrics support for the Kotlin language, building on top of #1214 (tree-sitter 0.26.3 upgrade).

Changes:

  • Implement Checker trait for KotlinCode with AST node recognition
  • Implement Getter trait for KotlinCode with space kind and Halstead classification
  • Implement all 11 metric modules for Kotlin: ABC, Cognitive, Cyclomatic, Exit, Halstead, LOC, NArgs, NOM, NPA, NPM, WMC
  • Add traverse_children helper method to Node for multi-level AST traversal

Background

The upstream repository includes Kotlin in the language enum via mk_langs! macro, but uses implement_metric_trait! to provide stub implementations that return empty/zero values. This PR replaces those stubs with real implementations that properly analyze Kotlin AST nodes.

Implementation Decisions

1. AST Node Mapping (Checker)

Mapped tree-sitter-kotlin-codanna node types to the Checker trait methods:

Method        | Kotlin Node Types
is_comment    | LineComment, MultilineComment
is_func_space | SourceFile, ClassDeclaration
is_func       | FunctionDeclaration
is_closure    | LambdaLiteral
is_string     | StringLiteral, LineStringLiteral, MultiLineStringLiteral
is_call       | CallExpression
is_non_arg    | FunctionBody, ModifierList, Statements (parameter containers excluded)
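
For readers unfamiliar with the Checker pattern, the mapping above can be sketched as a set of `matches!` predicates. This is only an illustration: `KotlinKind` is a hypothetical, hand-written subset of the generated Kotlin enum, and the free functions stand in for the trait methods, whose real signatures operate on `Node` and are not shown in this description.

```rust
/// Hypothetical subset of Kotlin node kinds (the real enum is generated from the grammar).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum KotlinKind {
    SourceFile,
    ClassDeclaration,
    FunctionDeclaration,
    LambdaLiteral,
    LineComment,
    MultilineComment,
    StringLiteral,
    LineStringLiteral,
    MultiLineStringLiteral,
    CallExpression,
    Identifier,
}

/// Comments: line and multi-line.
pub fn is_comment(kind: KotlinKind) -> bool {
    matches!(kind, KotlinKind::LineComment | KotlinKind::MultilineComment)
}

/// Spaces that can contain functions: the file itself and class declarations.
pub fn is_func_space(kind: KotlinKind) -> bool {
    matches!(kind, KotlinKind::SourceFile | KotlinKind::ClassDeclaration)
}

/// Named functions.
pub fn is_func(kind: KotlinKind) -> bool {
    matches!(kind, KotlinKind::FunctionDeclaration)
}

/// Lambdas are treated as closures.
pub fn is_closure(kind: KotlinKind) -> bool {
    matches!(kind, KotlinKind::LambdaLiteral)
}

/// All string literal flavours listed in the table.
pub fn is_string(kind: KotlinKind) -> bool {
    matches!(
        kind,
        KotlinKind::StringLiteral
            | KotlinKind::LineStringLiteral
            | KotlinKind::MultiLineStringLiteral
    )
}

/// Call sites.
pub fn is_call(kind: KotlinKind) -> bool {
    matches!(kind, KotlinKind::CallExpression)
}
```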

2. Halstead Classification (Getter)

Classified Kotlin tokens for Halstead metrics:

  • Operators: =, +=, -=, *=, /=, %=, &&, ||, !, ==, !=, <, >, <=, >=, +, -, *, /, %, ++, --, ?., ?:, !!, ::, ->, as, as?, is, !is, in, !in, .., ..<
  • Operands: Identifiers, literals (integer, real, boolean, character, string, null), this, super
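
A rough sketch of how this classification feeds the Halstead counts is shown below. It is string-based purely for illustration; the actual Getter matches on grammar node kinds rather than token text, and the names `classify` and `halstead_counts` are hypothetical. The identifier and integer checks are deliberately simplistic.

```rust
use std::collections::HashSet;

/// Operator spellings copied from the list above.
const OPERATORS: &[&str] = &[
    "=", "+=", "-=", "*=", "/=", "%=", "&&", "||", "!", "==", "!=", "<", ">", "<=", ">=",
    "+", "-", "*", "/", "%", "++", "--", "?.", "?:", "!!", "::", "->", "as", "as?", "is",
    "!is", "in", "!in", "..", "..<",
];

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HalsteadType {
    Operator,
    Operand,
    Unknown,
}

/// Classifies a single token by its text (an assumption made for this sketch).
pub fn classify(token: &str) -> HalsteadType {
    // Check operators first: `as`, `is` and `in` would otherwise look like identifiers.
    if OPERATORS.iter().any(|op| *op == token) {
        return HalsteadType::Operator;
    }
    // Words cover identifiers plus `this`, `super`, `true`, `false` and `null`.
    let mut chars = token.chars();
    let is_word = matches!(chars.next(), Some(c) if c.is_alphabetic() || c == '_')
        && chars.all(|c| c.is_alphanumeric() || c == '_');
    let is_integer = !token.is_empty() && token.chars().all(|c| c.is_ascii_digit());
    if is_word || is_integer {
        HalsteadType::Operand
    } else {
        HalsteadType::Unknown
    }
}

/// Returns (distinct operators, distinct operands, total operators, total operands).
pub fn halstead_counts(tokens: &[&str]) -> (usize, usize, usize, usize) {
    let (mut operators, mut operands) = (HashSet::new(), HashSet::new());
    let (mut total_operators, mut total_operands) = (0, 0);
    for &token in tokens {
        match classify(token) {
            HalsteadType::Operator => {
                operators.insert(token);
                total_operators += 1;
            }
            HalsteadType::Operand => {
                operands.insert(token);
                total_operands += 1;
            }
            HalsteadType::Unknown => {}
        }
    }
    (operators.len(), operands.len(), total_operators, total_operands)
}
```

For the token stream of `x = a + a ?: null`, this yields three distinct operators, three distinct operands, three operator occurrences and four operand occurrences.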

3. Cognitive Complexity

Implemented nesting-aware complexity following the same pattern as Java:

  • Increments for: if, when, for, while, do-while, catch, &&, ||
  • Nesting penalty for control structures inside other control structures
  • Special handling for else if chains (no additional nesting penalty)
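
The rules above can be sketched on a toy statement tree. This is not the code in src/metrics/cognitive.rs: the `Stmt` type and `cognitive` function are hypothetical, and the sketch assumes that each `else if` branch adds a flat +1 while the first `if` pays the usual nesting penalty.

```rust
/// Hypothetical, simplified statement tree used only for this illustration.
pub enum Stmt {
    /// Straight-line code: no increment.
    Simple,
    /// One `&&` or `||` operator: flat +1.
    BoolOp,
    /// `when`, `for`, `while`, `do-while` or `catch` with a nested body.
    Control(Vec<Stmt>),
    /// An `if` followed by zero or more `else if` branches, one body per branch.
    IfChain(Vec<Vec<Stmt>>),
}

/// Computes a nesting-aware complexity score for a list of statements.
pub fn cognitive(stmts: &[Stmt], nesting: u32) -> u32 {
    stmts
        .iter()
        .map(|stmt| match stmt {
            Stmt::Simple => 0,
            Stmt::BoolOp => 1,
            // Structural increment plus one extra point per enclosing control structure.
            Stmt::Control(body) => 1 + nesting + cognitive(body, nesting + 1),
            Stmt::IfChain(branches) => branches
                .iter()
                .enumerate()
                .map(|(index, body)| {
                    // Only the leading `if` pays the nesting penalty;
                    // `else if` branches add a flat +1 (assumption).
                    let increment = if index == 0 { 1 + nesting } else { 1 };
                    increment + cognitive(body, nesting + 1)
                })
                .sum::<u32>(),
        })
        .sum::<u32>()
}
```

With these rules, a `for` containing a single `when` scores 3 (1 for the loop, 2 for the nested `when`), while a top-level `if` / `else if` / `else if` chain with empty bodies also scores 3.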

4. ABC Metric Helper Functions

Created Kotlin-specific helper functions for ABC metric calculation:

  • kotlin_count_unary_conditions: Counts negation operators in conditions
  • kotlin_inspect_container: Identifies class members vs local variables
  • compute_kotlin_args: Extracts parameter counts from function declarations

5. traverse_children Method

Added a utility method to Node for navigating multi-level child paths:

pub(crate) fn traverse_children<F>(&self, token_list: &[F]) -> Option<Node<'a>>
where
    F: FnOnce(u16) -> bool + Copy

This is used in NPA/NPM metrics to navigate from property/function declarations through modifier lists to find visibility modifiers.
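
To make the intended behavior concrete, here is a minimal sketch of the same idea on a toy tree. `ToyNode` and the kind constants are hypothetical, and the sketch returns `Option<&ToyNode>` instead of `Option<Node<'a>>`; only the generic bound mirrors the signature above. Each predicate selects the first matching child at one level, and the walk stops with `None` as soon as a level has no match.

```rust
/// Hypothetical kind ids, standing in for grammar-generated values.
pub const FUNCTION: u16 = 1;
pub const MODIFIERS: u16 = 2;
pub const VISIBILITY: u16 = 3;

/// Toy stand-in for the crate's `Node` wrapper.
#[derive(Debug)]
pub struct ToyNode {
    pub kind_id: u16,
    pub children: Vec<ToyNode>,
}

impl ToyNode {
    /// Follows a path of predicates, descending one level per predicate.
    pub fn traverse_children<F>(&self, token_list: &[F]) -> Option<&ToyNode>
    where
        F: FnOnce(u16) -> bool + Copy,
    {
        let mut node = self;
        for &predicate in token_list {
            // `predicate` is `Copy`, so it can be called once per child.
            node = node
                .children
                .iter()
                .find(|child| predicate(child.kind_id))?;
        }
        Some(node)
    }
}
```

Because the slice holds a single type `F`, callers of this sketch would pass function pointers (for example `[fn(u16) -> bool; 2]`) rather than distinct closure types.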

6. WMC Computation

Extracted shared WMC computation logic into a reusable compute_wmc helper function, allowing consistent weighted method complexity calculation across languages.
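
As a reminder of what the helper computes, WMC for a class is commonly defined as the sum of the cyclomatic complexities of its methods. The sketch below shows that arithmetic only; the `Method` struct and this `compute_wmc` signature are hypothetical and do not reflect the helper's real signature in src/metrics/wmc.rs.

```rust
/// Hypothetical per-method record used only for this illustration.
pub struct Method {
    pub name: &'static str,
    pub cyclomatic: f64,
}

/// Sums the cyclomatic complexity of every method in a class.
pub fn compute_wmc(methods: &[Method]) -> f64 {
    methods.iter().map(|method| method.cyclomatic).sum()
}
```

A class with methods of cyclomatic complexity 1, 3 and 2 would therefore have a WMC of 6.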

Files Changed

File                      | Changes
src/checker.rs            | +55 lines - Full Checker implementation for KotlinCode
src/getter.rs             | +50 lines - Full Getter implementation with Halstead classification
src/metrics/abc.rs        | +234 lines - ABC metric with Kotlin-specific helpers
src/metrics/cognitive.rs  | +40 lines - Cognitive complexity implementation
src/metrics/cyclomatic.rs | +15 lines - Cyclomatic complexity implementation
src/metrics/exit.rs       | +10 lines - Return statement detection
src/metrics/halstead.rs   | +8 lines - Halstead metric implementation
src/metrics/loc.rs        | +25 lines - Lines of code metric
src/metrics/nargs.rs      | +24 lines - Number of arguments metric
src/metrics/npa.rs        | +49 lines - Number of public attributes
src/metrics/npm.rs        | +48 lines - Number of public methods
src/metrics/wmc.rs        | +23 lines - Weighted method complexity
src/node.rs               | +18 lines - traverse_children helper method

Test Plan

  • Verified compilation with cargo check
  • Verified all existing tests pass
  • Test with Kotlin source files to validate metric output accuracy

Dependencies

This PR depends on #1214 (tree-sitter 0.26.3 upgrade) which provides:

  • Updated tree-sitter-kotlin-codanna grammar
  • Language macro support for Kotlin

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings January 20, 2026 16:55

Copilot AI left a comment

Pull request overview

This PR implements comprehensive metrics support for the Kotlin language by replacing stub implementations with full AST-based analysis. The implementation builds on a tree-sitter upgrade (0.25.4 → 0.26.3) and switches the Kotlin grammar from tree-sitter-kotlin-ng to tree-sitter-kotlin-codanna.

Changes:

  • Implements Kotlin-specific Checker and Getter traits for AST node recognition and Halstead classification
  • Adds complete implementations for all 11 metric modules (ABC, Cognitive, Cyclomatic, Exit, Halstead, LOC, NArgs, NOM, NPA, NPM, WMC)
  • Introduces traverse_children helper method for multi-level AST navigation in NPA/NPM metrics

Reviewed changes

Copilot reviewed 45 out of 46 changed files in this pull request and generated 7 comments.

File                                 | Description
Cargo.toml, enums/Cargo.toml         | Updated tree-sitter to 0.26.3, switched Kotlin grammar to tree-sitter-kotlin-codanna 0.3.9, upgraded other grammar versions
src/langs.rs, enums/src/languages.rs | Updated Kotlin language binding from tree-sitter-kotlin-ng to tree-sitter-kotlin-codanna
src/macros.rs, enums/src/macros.rs   | Updated language macro to use new Kotlin grammar's language() function
src/checker.rs                       | Implemented Checker trait for Kotlin with comment, function, closure, call, and string detection
src/getter.rs                        | Implemented Getter trait with space kind identification and Halstead operator/operand classification
src/node.rs                          | Added traverse_children helper for navigating multi-level child paths
src/metrics/abc.rs                   | Implemented ABC metric with Kotlin-specific helper functions for condition counting
src/metrics/cognitive.rs             | Implemented cognitive complexity with nesting-aware control structure tracking
src/metrics/cyclomatic.rs            | Implemented cyclomatic complexity for Kotlin control flow
src/metrics/exit.rs                  | Implemented return statement detection
src/metrics/halstead.rs              | Implemented Halstead metrics using generic compute_halstead function
src/metrics/loc.rs                   | Implemented lines of code with comment and logical line detection
src/metrics/nargs.rs                 | Implemented parameter counting with Kotlin-specific helper
src/metrics/npm.rs                   | Implemented public method counting with visibility modifier traversal
src/metrics/npa.rs                   | Implemented public attribute counting with visibility modifier traversal
src/metrics/wmc.rs                   | Extracted shared compute_wmc helper, implemented for Kotlin
src/tools.rs                         | Simplified parentheses in conditional expression
src/languages/language_*.rs          | Updated language enum mappings for Rust, Python, JavaScript due to grammar updates
tree-sitter-*                        | Updated parser.h and parser.c files with @generated annotation, version bumps

Comment on lines +702 to +703

    | Kotlin::PIPEPIPE
    | Kotlin::UnaryExpression

Copilot AI Jan 20, 2026

The is_non_arg implementation includes PIPEPIPE (||) as a non-argument token, but this doesn't match the typical pattern from other language implementations which focus on delimiters and structural elements. The inclusion of UnaryExpression also seems unusual - other implementations typically only include punctuation tokens. This may cause issues with parameter counting in functions that use these operators.

Suggested change

    - | Kotlin::PIPEPIPE
    - | Kotlin::UnaryExpression

Comment on lines +557 to +560

    let mut has_boolean_content = matches!(
        node.parent().unwrap().kind_id().into(),
        BinaryExpression | IfExpression | WhileStatement | DoWhileStatement | ForStatement
    );

Copilot AI Jan 20, 2026

The use of unwrap() on node.parent() could cause a panic if the node doesn't have a parent. While this follows the existing Java implementation pattern, it's safer to use proper error handling. Consider using node.parent()? or an if let pattern to handle the None case gracefully, especially since this is new code being added.

Suggested change

    - let mut has_boolean_content = matches!(
    -     node.parent().unwrap().kind_id().into(),
    -     BinaryExpression | IfExpression | WhileStatement | DoWhileStatement | ForStatement
    - );
    + let mut has_boolean_content = node
    +     .parent()
    +     .map(|parent| {
    +         matches!(
    +             parent.kind_id().into(),
    +             BinaryExpression | IfExpression | WhileStatement | DoWhileStatement | ForStatement
    +         )
    +     })
    +     .unwrap_or(false);

Comment on lines +567 to +587

    let is_not_operator = matches!(node_kind, UnaryExpression)
        && matches!(node.child(0).unwrap().kind_id().into(), BANG);

    // Stops the exploration if the node is neither
    // a parenthesized expression nor a `Not` operator
    if !is_parenthesised_exp && !is_not_operator {
        break;
    }

    // Sets the flag to true if a `Not` operator is found
    // This is used to prove if a variable or a value returned by a method is actually boolean
    // e.g. `return (!x);`
    if !has_boolean_content && is_not_operator {
        has_boolean_content = true;
    }

    // Parenthesized expressions and `Not` operators nodes
    // always store their expressions in the children nodes of index one
    // Reference: tree-sitter-kotlin grammar for parenthesized_expression and prefix_expression
    node = node.child(1).unwrap();
    node_kind = node.kind_id().into();

Copilot AI Jan 20, 2026

The use of unwrap() on node.child(0) and node.child(1) could cause a panic if the expected children don't exist. While this mirrors the Java implementation, it assumes a specific AST structure that may not always hold. Consider adding guards or using proper error handling to prevent potential panics.
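
One way such guards could look is sketched below on a toy expression type. This is not the PR's code: `Kind`, `Expr` and `strip_wrappers` are hypothetical, and the sketch simplifies the real check (which inspects child 0 for `BANG`) to a dedicated `Not` kind. It keeps the "inner expression lives at child index 1" convention from the snippet above, but returns `None` instead of panicking when that child is missing.

```rust
/// Hypothetical node kinds for this illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Kind {
    /// `( expr )` with children `(`, expr, `)`.
    Paren,
    /// `! expr` with children `!`, expr.
    Not,
    /// Punctuation or operator token.
    Token,
    /// Any other expression.
    Other,
}

/// Toy stand-in for an AST node.
#[derive(Debug)]
pub struct Expr {
    pub kind: Kind,
    pub children: Vec<Expr>,
}

/// Strips parentheses and `!` wrappers.
/// Returns the innermost expression and whether a `!` was seen,
/// or `None` if a wrapper lacks the expected child at index 1.
pub fn strip_wrappers(mut node: &Expr) -> Option<(&Expr, bool)> {
    let mut saw_not = false;
    loop {
        match node.kind {
            Kind::Paren => {}
            Kind::Not => saw_not = true,
            Kind::Token | Kind::Other => return Some((node, saw_not)),
        }
        // `get` returns `None` for a malformed tree instead of panicking.
        node = node.children.get(1)?;
    }
}
```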

        add_cloc_lines(stats, start, end);
    }
    // Statement types for logical lines counting
    ForStatement | WhileStatement | DoWhileStatement => {

Copilot AI Jan 20, 2026

The LOC implementation only counts ForStatement, WhileStatement, and DoWhileStatement as logical lines. This appears incomplete compared to other language implementations which typically count various statement types including if statements, return statements, assignments, etc. Consider adding more statement types to provide accurate logical line counts.

Suggested change

    - ForStatement | WhileStatement | DoWhileStatement => {
    + ForStatement
    + | WhileStatement
    + | DoWhileStatement
    + | IfExpression
    + | WhenExpression
    + | ReturnExpression
    + | ThrowExpression
    + | TryExpression
    + | BreakExpression
    + | ContinueExpression
    + | Assignment => {

      pub(crate) fn child(&self, pos: usize) -> Option<Node<'a>> {
    -     self.0.child(pos).map(Node)
    +     self.0.child(pos as u32).map(Node)

Copilot AI Jan 20, 2026

The conversion from usize to u32 via pos as u32 could cause issues on platforms where usize is larger than u32, though this is unlikely in practice. However, this change should be verified to ensure it doesn't break existing code that relies on being able to pass larger indices. Consider adding a comment explaining why this cast is safe or using a checked conversion.

Suggested change

    - self.0.child(pos as u32).map(Node)
    + let idx = u32::try_from(pos).ok()?;
    + self.0.child(idx).map(Node)

    // StringLiteral covers both line strings and multi-line strings in this grammar
    RealLiteral | IntegerLiteral | HexLiteral | BinLiteral | CharacterLiteralToken1 | UniCharacterLiteralToken1
    | LiteralConstant | StringLiteral | StringContent | LambdaLiteral | FunctionLiteral
    | ObjectLiteral | UnsignedLiteral | LongLiteral | BooleanLiteral | CharacterLiteral => {

Copilot AI Jan 20, 2026

The Halstead operand classification appears to be missing the Identifier node type, which is present in other language implementations (Python, JavaScript, TypeScript, Rust, C++, Java). This means that variable names and function names won't be counted as operands in Halstead metrics calculations. Consider adding Identifier and SimpleIdentifier to the operand match pattern.

Suggested change

    - | ObjectLiteral | UnsignedLiteral | LongLiteral | BooleanLiteral | CharacterLiteral => {
    + | ObjectLiteral | UnsignedLiteral | LongLiteral | BooleanLiteral | CharacterLiteral
    + | Identifier | SimpleIdentifier => {

Comment on lines +719 to +746

    // The child node of index 3 contains the `condition` when
    // the initialization expression is a variable declaration
    // e.g. `for ( int i=0; `condition`; ... ) {}`
    if let Some(condition) = node.child(3) {
        match condition.kind_id().into() {
            SEMI => {
                // The child node of index 4 contains the `condition` when
                // the initialization expression is not a variable declaration
                // e.g. `for ( i=0; `condition`; ... ) {}`
                if let Some(cond) = node.child(4) {
                    match cond.kind_id().into() {
                        CallExpression | Identifier | True | False | SEMI | RPAREN => {
                            stats.conditions += 1.;
                        }
                        ParenthesizedExpression | UnaryExpression => {
                            kotlin_inspect_container(&cond, &mut stats.conditions);
                        }
                        _ => {}
                    }
                }
            }
            CallExpression | Identifier | True | False => {
                stats.conditions += 1.;
            }
            ParenthesizedExpression | UnaryExpression => {
                kotlin_inspect_container(&condition, &mut stats.conditions);
            }
            _ => {}

Copilot AI Jan 20, 2026

The ForStatement handling in the ABC metric appears to assume Java/C-style for loops with initialization, condition, and update expressions at specific child indices (3 and 4). However, Kotlin's for loops follow a different syntax (for (item in collection)) and don't have the same structure. This implementation will likely not work correctly for Kotlin for-loops and may cause incorrect condition counting or potential panics if child indices don't exist.

Suggested change

    - // The child node of index 3 contains the `condition` when
    - // the initialization expression is a variable declaration
    - // e.g. `for ( int i=0; `condition`; ... ) {}`
    - if let Some(condition) = node.child(3) {
    -     match condition.kind_id().into() {
    -         SEMI => {
    -             // The child node of index 4 contains the `condition` when
    -             // the initialization expression is not a variable declaration
    -             // e.g. `for ( i=0; `condition`; ... ) {}`
    -             if let Some(cond) = node.child(4) {
    -                 match cond.kind_id().into() {
    -                     CallExpression | Identifier | True | False | SEMI | RPAREN => {
    -                         stats.conditions += 1.;
    -                     }
    -                     ParenthesizedExpression | UnaryExpression => {
    -                         kotlin_inspect_container(&cond, &mut stats.conditions);
    -                     }
    -                     _ => {}
    -                 }
    -             }
    -         }
    -         CallExpression | Identifier | True | False => {
    -             stats.conditions += 1.;
    -         }
    -         ParenthesizedExpression | UnaryExpression => {
    -             kotlin_inspect_container(&condition, &mut stats.conditions);
    -         }
    -         _ => {}
    + // Java/C-style for-loops have semicolons in their header (e.g. `for (i = 0; condition; i++)`).
    + // Kotlin-style for-loops (`for (item in collection)`) do not have semicolons and also do not
    + // contain a boolean condition in the same sense, so they should not contribute to the C metric.
    + let mut has_semi = false;
    + for i in 0..node.child_count() {
    +     if let Some(child) = node.child(i) {
    +         if matches!(child.kind_id().into(), SEMI) {
    +             has_semi = true;
    +             break;
    +         }
    +     }
    + }
    + if has_semi {
    +     // The child node of index 3 contains the `condition` when
    +     // the initialization expression is a variable declaration
    +     // e.g. `for ( int i=0; `condition`; ... ) {}`
    +     if let Some(condition) = node.child(3) {
    +         match condition.kind_id().into() {
    +             SEMI => {
    +                 // The child node of index 4 contains the `condition` when
    +                 // the initialization expression is not a variable declaration
    +                 // e.g. `for ( i=0; `condition`; ... ) {}`
    +                 if let Some(cond) = node.child(4) {
    +                     match cond.kind_id().into() {
    +                         CallExpression | Identifier | True | False | SEMI | RPAREN => {
    +                             stats.conditions += 1.;
    +                         }
    +                         ParenthesizedExpression | UnaryExpression => {
    +                             kotlin_inspect_container(&cond, &mut stats.conditions);
    +                         }
    +                         _ => {}
    +                     }
    +                 }
    +             }
    +             CallExpression | Identifier | True | False => {
    +                 stats.conditions += 1.;
    +             }
    +             ParenthesizedExpression | UnaryExpression => {
    +                 kotlin_inspect_container(&condition, &mut stats.conditions);
    +             }
    +             _ => {}
    +         }

Marlon Costa added 2 commits January 20, 2026 14:15
- Update tree-sitter from 0.25.4 to 0.26.3
- Switch tree-sitter-kotlin-ng to tree-sitter-kotlin-codanna 0.3.9
- Update tree-sitter-javascript to 0.25.0
- Update tree-sitter-python to 0.25.0
- Update tree-sitter-rust to 0.24.0
- Regenerate language enums for all grammars
- Fix Node::child() parameter type for tree-sitter 0.26 API
- Update tree-sitter from 0.25.4 to 0.26.3
- Switch tree-sitter-kotlin-ng to tree-sitter-kotlin-codanna 0.3.9
- Update tree-sitter-javascript to 0.25.0
- Update tree-sitter-python to 0.25.0
- Update tree-sitter-rust to 0.24.0
- Regenerate language enums for all grammars
- Fix Node::child() to use u32 cast for tree-sitter 0.26 API
- Add macro case for tree_sitter_kotlin_codanna::language()
- Implement Checker trait for KotlinCode with AST node recognition
- Implement Getter trait with space kind and Halstead classification
- Implement all 11 metric modules: ABC, Cognitive, Cyclomatic, Exit,
  Halstead, LOC, NArgs, NOM, NPA, NPM, WMC
- Add traverse_children helper method to Node for multi-level traversal